The Silent Persuader: Geoffrey Hinton Warns of AI's Emotional Manipulation
Geoffrey Hinton warns that AI may surpass humans in emotional persuasion, calling for labeling, regulation, and improved media literacy to counter subtle manipulation.
Records found: 20
Geoffrey Hinton warns that AI may surpass humans in emotional persuasion, calling for labeling, regulation, and improved media literacy to counter subtle manipulation.
Anthropic's Claude has surpassed OpenAI in the enterprise AI market, capturing a 32% share by focusing on trust, compliance, and integration, reshaping the future of AI adoption in businesses.
Rubrics as Rewards (RaR) introduces a reinforcement learning approach that uses structured rubrics as reward signals, improving language model training in complex domains like medicine and science (a minimal reward-function sketch follows this list).
New research shows that adding context to ambiguous user queries significantly improves AI model evaluation, revealing biases and even reversing model rankings.
FlexOlmo introduces a modular framework that allows training large language models on private datasets without data sharing, achieving strong performance while respecting data governance and privacy constraints.
This tutorial shows how to use MLflow to evaluate Google Gemini's responses to factual prompts using built-in metrics, combining the OpenAI and Google APIs for end-to-end LLM assessment.
MIT and NUS researchers introduce MEM1, a reinforcement learning framework that enables language agents to efficiently manage memory during complex multi-turn tasks, outperforming larger models in speed and resource use.
Meta and collaborators developed a framework to quantify how much language models memorize from their training data, estimating that GPT-family models store around 3.6 bits per parameter and providing new insight into memorization versus generalization (see the back-of-envelope arithmetic after this list).
Tool-augmented AI agents extend language models with reasoning, memory, and autonomous tool use, enabling more capable and reliable AI systems.
NVIDIA introduces ProRL, a novel reinforcement learning method that extends training duration to unlock new reasoning capabilities in AI models, achieving superior performance across multiple reasoning benchmarks.
The Deep Research Bench report by FutureSearch evaluates AI agents on complex research tasks, revealing strengths and key limitations of leading models like OpenAI's o3 and Google Gemini.
Microsoft's Phi-4-reasoning demonstrates that high-quality, curated data can enable smaller AI models to perform advanced reasoning tasks as effectively as much larger models, challenging the notion that bigger models are always better.
Researchers from the National University of Singapore developed Thinkless, a framework that dynamically adjusts reasoning depth in language models, cutting unnecessary computation by up to 90% while maintaining accuracy.
PARSCALE introduces a parallel computation approach to scale language models efficiently, reducing memory use and latency while improving performance across various tasks.
New research reveals how integrating in-context learning insights into fine-tuning datasets significantly improves language model generalization on complex reasoning tasks.
The FalseReject dataset helps language models overcome excessive caution by training them to respond appropriately to sensitive yet harmless prompts, enhancing AI usefulness and safety.
New research shows that including toxic data in LLM pretraining improves the model's ability to be detoxified and controlled, leading to safer and more robust language models.
RLV introduces a unified framework that integrates verification into value-free reinforcement learning for language models, significantly improving reasoning accuracy and computational efficiency on mathematical reasoning benchmarks.
Researchers at UC Berkeley and UCSF have developed Adaptive Parallel Reasoning, a novel method that allows large language models to dynamically distribute inference tasks across parallel threads, enhancing reasoning performance without exceeding context window limits.
New research shows that specialized reasoning models combined with efficient inference-time scaling methods such as majority voting outperform non-reasoning models on complex tasks, offering guidance on how to allocate inference compute (a minimal voting sketch follows this list).
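To make the Rubrics as Rewards idea above concrete, here is a minimal Python sketch of a rubric-based reward: a weighted checklist scored against a model response. The rubric items, weights, and keyword checks are illustrative assumptions, not the RaR paper's actual implementation, where a stronger judge model would typically score each criterion.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class RubricItem:
    description: str                 # human-readable criterion
    weight: float                    # relative importance of this criterion
    check: Callable[[str], bool]     # judge: does the response satisfy it?

def rubric_reward(response: str, rubric: list[RubricItem]) -> float:
    """Weighted fraction of rubric criteria the response satisfies, in [0, 1]."""
    total = sum(item.weight for item in rubric)
    earned = sum(item.weight for item in rubric if item.check(response))
    return earned / total if total > 0 else 0.0

# Toy medical-answer rubric (hypothetical criteria; keyword checks stand in
# for an LLM or human judge):
rubric = [
    RubricItem("mentions a differential diagnosis", 2.0,
               lambda r: "differential" in r.lower()),
    RubricItem("recommends follow-up testing", 1.0,
               lambda r: "test" in r.lower()),
    RubricItem("avoids naming specific drug doses", 1.0,
               lambda r: "mg" not in r.lower()),
]

print(rubric_reward("Start 50 mg amoxicillin after reviewing the differential diagnosis.", rubric))
# -> 0.5 (earns 2.0 of the 4.0 total weight)
```

The scalar in [0, 1] can then be fed to any standard RL objective in place of a learned reward model.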
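The ~3.6 bits-per-parameter memorization estimate above invites a quick back-of-envelope calculation; the 1-billion-parameter model size below is a hypothetical example, not a figure from the paper.

```python
# Back-of-envelope arithmetic for the ~3.6 bits-per-parameter estimate.
bits_per_param = 3.6
n_params = 1_000_000_000            # hypothetical 1B-parameter model

total_bits = bits_per_param * n_params
total_bytes = total_bits / 8        # 8 bits per byte

print(f"~{total_bytes / 1e6:.0f} MB of raw memorized capacity")  # ~450 MB
```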
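Majority voting (self-consistency) from the last record is simple enough to sketch directly: sample several answers to the same prompt and keep the most common one. The sample_answer callable below is a placeholder for any stochastic LLM call, not a real API.

```python
from collections import Counter
from typing import Callable
import random

def majority_vote(sample_answer: Callable[[str], str], prompt: str, k: int = 8) -> str:
    """Draw k independent samples and return the modal (most frequent) answer."""
    votes = Counter(sample_answer(prompt) for _ in range(k))
    return votes.most_common(1)[0][0]

# Toy usage with a random stand-in for a sampled model:
answers = ["42", "42", "42", "41", "40"]
print(majority_vote(lambda p: random.choice(answers), "What is 6 * 7?", k=25))
# With enough samples, "42" wins with high probability.
```

Increasing k trades extra inference compute for a more reliable final answer, which is exactly the scaling knob the research above studies.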